Spectral normalization employing hidden Markov modeling of line spectrum pair frequencies

Authors

Bryan L. Pellom

John H. L. Hansen

Abstract

This paper proposes a spectral normalization approach in which the acoustical qualities of an input speech waveform are mapped onto those of a desired neutral voice. Such a method can be effective in reducing the impact of speaker variability such as accent, stress, and emotion for speech recognition. In the proposed method, the transformation is performed by modeling the temporal characteristics of the Line Spectrum Pair (LSP) frequencies of the neutral voice using hidden Markov models. The overall approach is integrated into a pitch synchronous overlap and add (PSOLA) analysis/synthesis framework. The algorithm is objectively evaluated using a distance measure based on the log-likelihood of observing the input (or normalized input) speech given Gaussian mixture speaker models for both the input and desired neutral voice. Results using the criteria formulated from the Gaussian mixture models demonstrate consistent normalization on a 10-speaker database.
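
The abstract does not give the exact form of the distance measure, so the following is only a minimal sketch of one way such a log-likelihood comparison could look. It assumes diagonal-covariance Gaussian mixture models and scores a set of feature frames by the difference between their average log-likelihood under the neutral-voice model and under the input-speaker model. The function names `gmm_avg_loglik` and `normalization_score`, the `(weights, means, variances)` tuple layout, and the choice of a simple difference are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D) feature vectors; weights: (M,); means, variances: (M, D).
    (Hypothetical helper; the parameter layout is an assumption.)
    """
    frames = np.asarray(frames, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    # Per-frame, per-component Gaussian log-densities, shape (T, M).
    diff = frames[:, None, :] - means[None, :, :]
    log_comp = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )

    # Mix the components with a numerically stable log-sum-exp.
    log_weighted = np.log(weights) + log_comp
    peak = log_weighted.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return float(frame_ll.mean())


def normalization_score(frames, input_gmm, neutral_gmm):
    """Assumed score: positive when the frames are better explained by the
    neutral-voice model than by the input-speaker model."""
    return gmm_avg_loglik(frames, *neutral_gmm) - gmm_avg_loglik(frames, *input_gmm)
```

Under this assumed formulation, a successful normalization would move the score of the processed speech toward positive values relative to the score of the unprocessed input.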


Similar resources

Considering Global Variance of the Log Power Spectrum Derived from Mel-Cepstrum in HMM-based Parametric Speech Synthesis

This paper utilizes global variance (GV) of the log power spectrum (LPS) derived from mel-cepstrum to improve hidden Markov model (HMM) based parametric speech synthesis. In order to alleviate over-smoothing of the generated spectral structures, an LPS-GV modeling method using line spectral pairs (LSPs) has been proposed in our previous work, where the estimated distribution of LPS-GV was combi...



Prediction of Voice Aperiodicity Based on Spectral Representations in HMM Speech Synthesis

In hidden Markov model-based speech synthesis, speech is typically parameterized using source-filter decomposition. A widely used analysis/synthesis framework, STRAIGHT, decomposes the speech waveform into a framewise spectral envelope and a mixed mode excitation signal. Inclusion of an aperiodicity measure in the model enables synthesis also for signals that are not purely voiced or unvoiced. ...


An HMM-Based Mandarin Chinese Text-To-Speech System

In this paper we present our Hidden Markov Model (HMM)-based, Mandarin Chinese Text-to-Speech (TTS) system. Mandarin Chinese or Putonghua, “the common spoken language”, is a tone language where each of the 400 plus base syllables can have up to 5 different lexical tone patterns. Their segmental and supra-segmental information is first modeled by 3 corresponding HMMs, including: (1) spectral env...


An improved model of masking effects for robust speech recognition system

Performance of an automatic speech recognition system drops dramatically in the presence of background noise unlike the human auditory system which is more adept at noisy speech recognition. This paper proposes a novel auditory modeling algorithm which is integrated into the feature extraction front-end for Hidden Markov Model (HMM). The proposed algorithm is named LTFC which simulates properti...



Publication date: 1997
